#Model Adaptation 23/04/2025
Revolutionizing LLMs: Self-Evolving Language Models Learn Without Labels Using Test-Time Reinforcement Learning
Researchers from Tsinghua University and Shanghai AI Lab introduce Test-Time Reinforcement Learning (TTRL), a method that lets large language models improve their performance without labeled data by using self-generated pseudo-rewards at inference time.
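
To make the idea of a self-generated pseudo-reward concrete, here is a minimal sketch. It assumes the reward comes from majority voting over several answers sampled for the same prompt, a detail the summary above does not spell out. The function name `majority_vote_rewards` and the example answers are hypothetical, chosen only for illustration; this is not the authors' implementation.

```python
from collections import Counter


def majority_vote_rewards(answers):
    """Return (pseudo_label, rewards) for a list of sampled final answers.

    Assumption: the most common answer is treated as the pseudo-label, and
    each sample gets reward 1.0 if it matches that label, otherwise 0.0.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Ties are resolved in favor of the answer that was seen first.
    pseudo_label, _ = Counter(answers).most_common(1)[0]
    rewards = [1.0 if a == pseudo_label else 0.0 for a in answers]
    return pseudo_label, rewards


# Hypothetical final answers extracted from four samples of one prompt.
label, rewards = majority_vote_rewards(["42", "41", "42", "7"])
```

No ground-truth label appears anywhere in this sketch: the rewards depend only on agreement among the model's own samples, and they could then be passed to a standard reinforcement learning update.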